68 research outputs found
A Unified approach to concurrent and parallel algorithms on balanced data structures
Concurrent and parallel algorithms are different. However, in the case of dictionaries, both kinds of algorithms share many
common points. We present a unified approach emphasizing these points. It is based on a careful analysis of the sequential
algorithm, extracting from it the most basic facts, which are later encapsulated as local rules. We apply the method to the
insertion algorithms in AVL trees. All the concurrent and parallel insertion algorithms have two main phases: a
percolation phase, which moves the keys to be inserted down the tree, and a rebalancing phase. Finally, some other algorithms and
balanced structures are discussed.
Postprint (published version)
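For orientation, the two phases named in the abstract above can be illustrated on the ordinary sequential case. The following is a minimal sketch of textbook sequential AVL insertion in Python, not the paper's concurrent or parallel local rules; every name in it (`Node`, `insert`, `rebalance`, and so on) is a hypothetical choice made for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # height of the subtree rooted at this node


def height(n):
    return n.height if n else 0


def update(n):
    n.height = 1 + max(height(n.left), height(n.right))


def balance(n):
    return height(n.left) - height(n.right)


def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y


def rebalance(n):
    """Restore the AVL condition at n; return the new subtree root."""
    update(n)
    b = balance(n)
    if b > 1:  # left-heavy
        if balance(n.left) < 0:
            n.left = rotate_left(n.left)
        return rotate_right(n)
    if b < -1:  # right-heavy
        if balance(n.right) > 0:
            n.right = rotate_right(n.right)
        return rotate_left(n)
    return n


def insert(root, key):
    # Phase 1 (percolation): move the key down, recording the search path.
    path, node = [], root
    while node is not None:
        if key == node.key:
            return root  # duplicates are ignored in this sketch
        path.append(node)
        node = node.left if key < node.key else node.right
    new = Node(key)
    if not path:
        return new
    parent = path[-1]
    if key < parent.key:
        parent.left = new
    else:
        parent.right = new
    # Phase 2 (rebalancing): walk the recorded path bottom-up.
    for i in range(len(path) - 1, -1, -1):
        n = path[i]
        sub = rebalance(n)
        if i == 0:
            root = sub
        elif path[i - 1].left is n:
            path[i - 1].left = sub
        else:
            path[i - 1].right = sub
    return root


def inorder(n):
    return inorder(n.left) + [n.key] + inorder(n.right) if n else []


def is_avl(n):
    return n is None or (abs(balance(n)) <= 1 and is_avl(n.left) and is_avl(n.right))
```

Recording the path explicitly, rather than recursing, keeps the downward and upward passes visibly separate, which is the split the abstract describes.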
Parallel dictionaries with local rules on AVL and brother trees
We present a set of local rules for dictionaries whose main advantage is that they can be scheduled in a highly
synchronized way to obtain parallel dictionaries on AVL trees. Until now, trees used in massively parallel dictionaries needed to
have all the leaves at the same depth, as in 2-3 trees. It was therefore possible (in insertions and deletions) to
reconstruct the tree bottom-up in a very regular fashion, as a pipeline of plane waves moving up. On AVL trees the
situation is different: leaves can have different depths, so any wave in a pipeline is highly irregular. To
solve this problem we define virtual plane waves, allowing us to develop an EREW dictionary for keys with
processors and time . Later on we generalize the se
Postprint (published version)
Concurrent rebalancing on hyperred-black trees
HyperRed-Black trees are a relaxed version of Red-Black
trees that allow a high degree of concurrency. In Red-Black trees,
consecutive red nodes are forbidden. This restriction has been
lifted in Chromatic trees, which were introduced by
O. Nurmi and E. Soisalon-Soininen to work in a concurrent
environment. A Chromatic tree can have big clusters of red nodes
surrounded by black nodes. Nevertheless, concurrent rebalancing of
Chromatic trees into Red-Black trees has a serious drawback:
in a big cluster of red nodes, only the top node can be updated. Direct
updating inside the cluster is forbidden. This approach gives only a
limited degree of concurrency. HyperRed-Black trees have been
designed to solve this problem: it is possible to update red nodes
inside a red cluster. In a HyperRed-Black tree, nodes can
have a multiplicity of colors; they can be red, black or hyper-red.
Postprint (published version)
GenExp: An Interactive Web-Based Genomic DAS Client with Client-Side Data Rendering
Background: The Distributed Annotation System (DAS) offers a standard protocol for sharing and integrating annotations on biological sequences. There are more than 1000 DAS sources available and the number is steadily increasing. Clients are an essential part of the DAS system and integrate data from several independent sources in order to create a useful representation for the user. While web-based DAS clients exist, most of them do not have direct interaction capabilities such as dragging and zooming with the mouse. Results: Here we present GenExp, a web-based and fully interactive visual DAS client. GenExp is a genome-oriented DAS client capable of creating informative representations of genomic data, zooming out from base level to complete chromosomes. It proposes a novel approach to genomic data rendering and uses the latest HTML5 web technologies to create the data representation inside the client browser. Thanks to client-side rendering, most position changes do not need a network request to the server, so responses to zooming and panning are almost immediate. In GenExp it is possible to explore the genome intuitively by moving it with the mouse, just as in geographical map applications. Additionally, in GenExp it is possible to have more than one data viewer open at the same time and to save the current state of the application to revisit it later on. Conclusions: GenExp is a new interactive web-based client for DAS and addresses some of the shortcomings of the existin
Fringe analysis for parallel MacroSplit insertion algorithms in 2-3 trees
We extend fringe analysis (used to study the expected behavior of balanced search trees under sequential insertions) to deal with synchronous parallel insertions on 2-3 trees. Given an insertion of k keys in a tree with n nodes, the fringe evolves following a transition matrix whose coefficients reflect the precise form of the algorithm but do not depend on k or n. The derivation of this matrix uses the binomial transform recently developed by P. Poblete, J. Munro and Th. Papadakis. Due to the complexity of this exact analysis, we also develop two approximations: the first based on a simplified parallel model, and the second based on the sequential model.
These two approximate analyses show that the parallel insertion case does not differ significantly from the sequential case, namely
only in terms of order O(1/n^2).
Postprint (published version)
Satellites in the prokaryote world
Background
Satellites or tandem repeats are very abundant in many eukaryotic genomes. Occasionally they have been reported to be present in some prokaryotes, but to our knowledge there is no general comparative study on their occurrence. For this reason we present here an overview of the distribution and properties of satellites in a set of representative species. Our results provide novel insights into the evolutionary relationship between eukaryotes, Archaea and Bacteria.
Results
We have searched for all possible satellites present in the NCBI reference group of genomes in Archaea (142 species) and in Bacteria (119 species), detecting 2735 satellites in Archaea and 1067 in Bacteria. We have found that the distribution of satellites is very variable in different organisms. The archaeal Methanosarcina class stands out for the large number of satellites in their genomes. Satellites from a few species have similar characteristics to those in eukaryotes, but most species have very few satellites: only 21 species in Archaea and 18 in Bacteria have more than 4 satellites/Mb. The distribution of satellites in these species is reminiscent of what is found in eukaryotes, but we find two significant differences: most satellites are short, and many of them correspond to segments of genes coding for amino acid repeats. Transposition of non-coding satellites throughout the genome occurs rarely: only in the bacterium Leptospira interrogans and the archaeon Methanocella conradii have we detected satellite families of transposed satellites with long repeats.
Conclusions
Our results demonstrate that the presence of satellites in the genome is not an exclusive feature of eukaryotes. We have described a few prokaryotes which do contain satellites. We present a discussion of their possible evolutionary significance.
Peer Reviewed
Postprint (published version)
Unique features of satellite DNA transcription in different tissues of Caenorhabditis elegans
A large part of the genome is known to be transcribed as non-coding DNA including some tandem repeats (satellites) such as telomeric/centromeric satellites in different species. However, there has been no detailed study on the possible transcription of the interspersed satellites found in many species. In the present paper, we studied the transcription of the abundant DNA satellites in the nematode Caenorhabditis elegans using available RNA-Seq results. We found that many of them have been transcribed, but usually in an irregular manner; different regions of a satellite have been transcribed with variable efficiency. Satellites with a similar repeat sequence also have a different transcription pattern depending on their position in the genome. We also describe the peculiar features of satellites associated with Helitron transposons in C. elegans. Our demonstration that some satellite RNAs are transcribed adds a new family of non-coding RNAs, a new element in the world of RNA interference, with new paths for the control of mRNA translation. This is a field that requires further investigation and will provide a deeper understanding of gene expression and control.
This work was supported by grant PID2021-122830OB-C43, funded by MCIN/AEI/10.13039/501100011033 and by “ERDF: A way of making Europe”.
Peer Reviewed
Postprint (published version)
DNA satellites are transcribed as part of the non-coding genome in eukaryotes and bacteria
It has been shown in recent years that many repeated sequences in the genome are expressed as RNA transcripts, although the role of such RNAs is poorly understood. Some isolated and tandem repeats (satellites) have been found to be transcribed, such as mammalian Alu sequences and telomeric/centromeric satellites in different species. However, there is no detailed study on the possible transcription of the interspersed satellites found in many species. Therefore, we decided to study for the first time the transcription of the abundant DNA satellites in the bacterium Bacillus coagulans and in the nematode Caenorhabditis elegans. We have updated the data for C. elegans satellites using the latest version of the genome. We analyzed the transcription of satellites in both species in available RNA-seq results and found that they are widely transcribed. Our demonstration that satellite RNAs are transcribed adds a new family of non-coding RNAs. This is a field that requires further investigation and will provide a deeper understanding of gene expression and control.
This work was supported by Ministerio de Ciencia e Innovación, Spain [Project RTI2018-094403-B-C33 funded by MCIN/ AEI 10.13039/501100011033/ FEDER].
Peer Reviewed
Postprint (published version)
Genome-wide analysis of the emigrant family of MITEs: amplification dynamics and evolution of genes in Arabidopsis thaliana
MITEs are structurally similar to defective class II elements but
their high copy number and the size and sequence conservation of most
MITE families suggest that they can be amplified by a replicative
mechanism. Here we present a genome-wide analysis of the Emigrant
family of MITEs from Arabidopsis thaliana. In order to be able to
detect divergent ancient copies and low copy number subfamilies with a
different internal sequence, we have developed a computer program
(http://www.lsi.upc.es/~alggen) that searches for Emigrant
elements based solely on their TIR sequence. Our results show that
different bursts of amplification of one or very few active, or
master, elements have occurred at different times during Arabidopsis
evolution, with an insertion dynamics similar to that of some
SINEs. The analysis of the insertion sites of the Emigrant elements
shows that, although Emigrant elements tend to integrate far from ORFs,
the elements inserted within or close to genes are preferentially
maintained during evolution.
Postprint (published version)
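To make the idea of a TIR-based search concrete, here is a minimal Python sketch that reports regions bounded by a terminal inverted repeat (TIR) and its reverse complement. It is an illustration under stated assumptions, not the authors' program: it uses exact string matching only, and the function names, the example TIR and the length limits are all hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def find_tir_candidates(genome, tir, min_len, max_len):
    """List (start, end) half-open intervals that begin with `tir` and
    end with its reverse complement, with total length in [min_len, max_len].

    Assumption for this sketch: exact matches only, forward strand only.
    """
    right = reverse_complement(tir)
    hits = []
    start = genome.find(tir)
    while start != -1:
        # Look for the closing TIR downstream of the opening one.
        end = genome.find(right, start + len(tir))
        while end != -1:
            length = end + len(right) - start
            if length > max_len:
                break  # later matches can only be longer
            if length >= min_len:
                hits.append((start, end + len(right)))
            end = genome.find(right, end + 1)
        start = genome.find(tir, start + 1)
    return hits


# Hypothetical toy input: a 5-base TIR flanking eight internal bases.
toy_genome = "GGG" + "CAGTA" + "AAAAAAAA" + "TACTG" + "CCC"
candidates = find_tir_candidates(toy_genome, "CAGTA", min_len=10, max_len=50)
```

Because the search depends only on the terminal sequences, the internal region may be arbitrary, which is the property the abstract relies on to find subfamilies with a different internal sequence.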
easyDAS: Automatic creation of DAS servers
Background: The Distributed Annotation System (DAS) has proven to be a successful way to publish and share
biological data. Although there are more than 750 active registered servers from around 50 organizations, setting
up a DAS server requires a fair amount of work, making it difficult for many research groups to share their
biological annotations. Given the clear benefit to the research community of widespread sharing of relevant
biological data, it would be desirable to facilitate the sharing process.
Results: Here we present easyDAS, a web-based system enabling anyone to publish biological annotations with
just a few clicks. The system, available at http://www.ebi.ac.uk/panda-srv/easydas, is capable of reading different
standard data file formats, processing the data and creating a new publicly available DAS source in a completely
automated way. The created sources are hosted on the EBI systems and can take advantage of their high storage
capacity and network connection, freeing the data provider from any network management work. easyDAS is an
open source project under the GNU LGPL license.
Conclusions: easyDAS is an automated DAS source creation system which can help many researchers share
their biological data, potentially increasing the amount of relevant biological data available to the scientific
community.
Postprint (published version)